Confidentiality-Preserving Publish/Subscribe: A Survey
Publish/subscribe (pub/sub) is an attractive communication paradigm for
large-scale distributed applications running across multiple administrative
domains. Pub/sub allows event-based information dissemination based on
constraints on the nature of the data rather than on pre-established
communication channels. It is a natural fit for deployment in untrusted
environments such as public clouds linking applications across multiple sites.
However, pub/sub in untrusted environments leads to major confidentiality
concerns stemming from the content-centric nature of the communications. This
survey classifies and analyzes different approaches to confidentiality
preservation for pub/sub, from applications of trust and access control models
to novel encryption techniques. It provides an overview of the current
challenges posed by confidentiality concerns and points to future research
directions in this promising field.
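To make the content-centric model concrete, the following minimal sketch shows how a broker matches events against subscriber constraints rather than channel names. All names here are illustrative, not drawn from any surveyed system; note that such a broker necessarily sees event content, which is the root of the confidentiality concerns the survey addresses.

```python
# Hypothetical sketch of content-based pub/sub matching: subscribers
# register predicates over event attributes, not channel names.
class Broker:
    def __init__(self):
        self.subscriptions = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self.subscriptions.append((predicate, callback))

    def publish(self, event):
        # Each event is matched against every registered constraint,
        # rather than being routed over a pre-established channel.
        for predicate, callback in self.subscriptions:
            if predicate(event):
                callback(event)

received = []
broker = Broker()
broker.subscribe(lambda e: e["type"] == "trade" and e["price"] > 100,
                 received.append)
broker.publish({"type": "trade", "price": 150})  # matches the constraint
broker.publish({"type": "trade", "price": 50})   # filtered out
```

Because matching requires the broker to inspect attributes like `price`, an untrusted broker learns the content of every event, which motivates the encrypted-matching techniques the survey classifies.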
Reliable Messaging to Millions of Users with MigratoryData
Web-based notification services are used by a wide range of businesses to
selectively distribute live updates to customers, following the
publish/subscribe (pub/sub) model. Typical deployments can involve millions of
subscribers expecting ordering and delivery guarantees together with low
latencies. Notification services must be vertically and horizontally scalable,
and adopt replication to provide a reliable service. We report our experience
building and operating MigratoryData, a highly-scalable notification service.
We discuss the typical requirements of MigratoryData customers, and describe
the architecture and design of the service, focusing on scalability and fault
tolerance. Our evaluation demonstrates the ability of MigratoryData to handle
millions of concurrent connections and support a reliable notification service
despite server failures and network disconnections.
StreamBed: capacity planning for stream processing
StreamBed is a capacity planning system for stream processing. It predicts,
ahead of any production deployment, the resources that a query will require to
process an incoming data rate sustainably, and the appropriate configuration of
these resources. StreamBed builds a capacity planning model by piloting a
series of runs of the target query in a small-scale, controlled testbed. We
implement StreamBed for the popular Flink DSP engine. Our evaluation with
large-scale queries of the Nexmark benchmark demonstrates that StreamBed can
effectively and accurately predict capacity requirements for jobs spanning more
than 1,000 cores using a testbed of only 48 cores.
Comment: 14 pages, 11 figures. This project has been funded by the Walloon region (Belgium) through the Win2Wal project GEPICIA.
Peer to peer multidimensional overlays: Approximating complex structures
Peer-to-peer overlay networks have proven to be a good support for storing and retrieving data in a fully decentralized way. A sound approach is to structure them so that they reflect the structure of the application: peers represent objects of the application, so that neighbours in the peer-to-peer network are objects having similar characteristics from the application's point of view. Such structured peer-to-peer overlay networks provide a natural support for range queries. While some complex structures, such as a Voronoï tessellation where each peer is associated with a cell in the space, are clearly relevant to structure the objects, the cost of computing and maintaining these structures is usually extremely high for dimensions larger than 2. We argue that an approximation of a complex structure is enough to provide native support for range queries. This stems from the fact that neighbours are important, while the exact space partitioning associated with a given peer is not as crucial. In this paper we present the design, analysis and evaluation of RayNet, a loosely structured Voronoï-based overlay network. RayNet organizes peers in an approximation of a Voronoï tessellation in a fully decentralized way. It relies on a Monte-Carlo algorithm to estimate the size of a cell and on an epidemic protocol to discover neighbours. To ensure efficient (polylogarithmic) routing, RayNet is inspired by Kleinberg's small-world model, where each peer gets connected to close neighbours (its approximate Voronoï neighbours in RayNet) and to shortcuts, long-range neighbours, implemented using an existing Kleinberg-like peer sampling protocol.
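The idea of estimating a Voronoï cell's size by Monte-Carlo sampling, rather than computing the tessellation exactly, can be illustrated with a generic point-sampling sketch. This is an assumption-laden illustration, not RayNet's actual estimator (the names, the unit-square domain, and point sampling are all illustrative); it only shows why sampling sidesteps the cost of exact high-dimensional Voronoï computation.

```python
import random

# Illustrative Monte-Carlo estimate of a peer's Voronoi cell volume:
# sample random points and count the fraction for which this peer is
# the nearest one. No exact tessellation is ever computed.
def cell_volume(peer, others, samples=20000, dim=2, rng=None):
    rng = rng or random.Random(42)  # fixed seed for reproducibility

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    hits = 0
    for _ in range(samples):
        p = tuple(rng.random() for _ in range(dim))  # point in unit cube
        if dist2(p, peer) <= min(dist2(p, o) for o in others):
            hits += 1
    return hits / samples  # estimated fraction of the space in peer's cell

# Four symmetric peers in the unit square: each cell covers ~1/4 of it.
peers = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]
est = cell_volume(peers[0], peers[1:])
```

With the symmetric placement above, the estimate converges to 0.25; the cost grows with the number of samples and neighbours, not with the combinatorial complexity of the exact tessellation, which is what makes approximation attractive beyond dimension 2.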
LayStream: composing standard gossip protocols for live video streaming
Gossip-based live streaming is a popular topic, as attested by the vast literature on the subject. Despite the particular merits of each proposal, all need to implement and deal with common challenges such as membership management, topology construction, and video packet dissemination. Well-principled gossip-based protocols have been proposed in the literature for each of these aspects. Our goal is to assess the feasibility of building a live streaming system, LayStream, as a composition of these existing protocols, to deploy the resulting system on real testbeds, and to report on lessons learned in the process. Unlike previous evaluations conducted by simulation and considering each protocol independently, we use real deployments. We evaluate protocols both independently and as a layered composition, and unearth specific problems and challenges associated with deployment and composition. We discuss and present solutions for these, such as a novel topology construction mechanism able to cope with the specificities of a large-scale and delay-sensitive environment, but also with requirements from the upper layer. Our implementation and data are openly available to support experimental reproducibility.
Slead: low-memory, steady distributed systems slicing
Slicing a large-scale distributed system is the process of autonomously partitioning its nodes into k groups, named slices. Slicing is associated with an order on node-specific criteria, such as available storage, uptime, or bandwidth. Each slice corresponds to the nodes between two quantiles in a virtual ranking according to the criteria.
For instance, a system can be split in three groups, one with nodes with the lowest uptimes, one with nodes with the highest uptimes, and one in the middle. Such a partitioning can be used by applications to assign different tasks to different groups of nodes, e.g., assigning critical tasks to the more powerful or stable nodes and less critical tasks to other slices.
Assigning a slice to each node in a large-scale distributed system, where no global knowledge of nodes' criteria exists, is not trivial. Recently, much research effort has been dedicated to guaranteeing fast and correct convergence towards the assignment that a global sort of the nodes would produce.
Unfortunately, state-of-the-art slicing protocols exhibit flaws that preclude their application in real scenarios, in particular with respect to cost and stability. In this paper, we identify steadiness issues, where nodes at a slice border constantly exchange slice assignments, as well as large memory requirements for adequate convergence, and we provide practical solutions for both. Our solutions are generic and can be applied to two different state-of-the-art slicing protocols with little effort, while preserving the desirable properties of each. The effectiveness of the proposed solutions is extensively studied in several simulated experiments.
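The target assignment that the distributed protocols approximate can be stated centrally: a node's slice is the quantile group its rank falls into in the virtual sort. The sketch below only defines these reference semantics (names and the assumption of distinct criterion values are illustrative); the point of the paper's protocols is to reach this assignment without any node holding the global view.

```python
# Centralized reference semantics for slicing: with global knowledge,
# a node's slice is the quantile group of its rank among all values.
# Assumes distinct criterion values for simplicity of illustration.
def slice_of(node_value, all_values, k):
    ranked = sorted(all_values)                # the virtual global ranking
    rank = ranked.index(node_value)            # this node's position in it
    return min(k - 1, rank * k // len(ranked)) # quantile group in [0, k)

# Example: slice six nodes by uptime into k = 3 slices.
uptimes = [3, 47, 12, 99, 58, 21]
slices = {u: slice_of(u, uptimes, 3) for u in uptimes}
# lowest third of uptimes -> slice 0, middle third -> 1, highest -> 2
```

An application could then assign critical tasks only to nodes in the top slice. The hard part, which this sketch sidesteps entirely, is that no node knows `all_values`, and nodes whose values sit exactly at a slice border may flip between adjacent slices as estimates fluctuate, which is the steadiness issue identified above.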
On using micro-clouds to deliver the fog
The cloud is scalable and cost-efficient, but it is not ideal for hosting all applications. Fog computing proposes an alternative of offloading some computation to the edge. Which applications to offload, where to, and when is not entirely clear yet, due to our lack of understanding of potential edge infrastructures. Through a number of experiments, we showcase the feasibility and readiness of micro-clouds formed by collections of Raspberry Pis to host a range of fog applications, particularly for network-constrained environments.
A methodology for tenant migration in legacy shared-table multi-tenant applications
Multi-tenancy enables cost-effective SaaS through resource consolidation. Multiple customers, or tenants, are served by a single application instance, and isolation is enforced at the application level. Service load for different tenants can vary over time, requiring applications to scale in and out. A large class of SaaS providers operates legacy applications structured around a relational (SQL) database. These applications achieve tenant isolation through dedicated fields in their relational schema and are not designed to support scaling operations. We present a novel solution for scaling in or out such applications through the migration of a tenant's data to new application and database instances. Our solution requires no change to the application and incurs no service downtime for non-migrated tenants. It leverages external tables and foreign data wrappers, as supported by major relational databases. We evaluate the approach using two multi-tenant applications: Iomad, an extension of the Moodle Learning Management System, and Camunda, a business process management platform. Our results show the usability of the method, minimally impacting performance for other tenants during migration and leading to increased service capacity after migration.
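To give a feel for the foreign-data-wrapper mechanism, the sketch below generates the kind of statements a destination database could run to pull one tenant's rows out of a legacy shared table, using PostgreSQL's `postgres_fdw` as one concrete wrapper. The table, column, and credential names are entirely hypothetical, and the paper's actual migration procedure may differ; this only illustrates the external-table idea of reading remote rows as if they were local.

```python
# Hypothetical generator for an FDW-based tenant pull. Uses PostgreSQL's
# postgres_fdw; table/column names ('orders', 'tenant_id') are invented
# for illustration and are not from the evaluated applications.
def migration_statements(tenant_id, src_host, src_db):
    """Return the statements a destination instance would run to copy
    one tenant's rows from the legacy shared table via a foreign table."""
    return [
        "CREATE EXTENSION IF NOT EXISTS postgres_fdw;",
        f"CREATE SERVER legacy FOREIGN DATA WRAPPER postgres_fdw "
        f"OPTIONS (host '{src_host}', dbname '{src_db}');",
        "CREATE USER MAPPING FOR CURRENT_USER SERVER legacy "
        "OPTIONS (user 'migrator', password 'secret');",
        "CREATE FOREIGN TABLE legacy_orders "
        "(id int, tenant_id int, payload text) "
        "SERVER legacy OPTIONS (table_name 'orders');",
        # Only the migrated tenant's rows cross over; other tenants'
        # rows are untouched, so they see no downtime.
        f"INSERT INTO orders SELECT * FROM legacy_orders "
        f"WHERE tenant_id = {tenant_id};",
    ]

stmts = migration_statements(42, "db1.example.org", "saasdb")
```

Because the copy is an ordinary `INSERT ... SELECT` filtered on the tenant-isolation column, the application itself needs no modification, which matches the no-application-change property claimed above.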
Content Censorship in the InterPlanetary File System
The InterPlanetary File System (IPFS) is currently the largest decentralized
storage solution in operation, with thousands of active participants and
millions of daily content transfers. IPFS is used as remote data storage for
numerous blockchain-based smart contracts, Non-Fungible Tokens (NFT), and
decentralized applications.
We present a content censorship attack that can be executed with minimal
effort and cost, and that prevents the retrieval of any chosen content in the
IPFS network. The attack exploits a conceptual issue in a core component of
IPFS, the Kademlia Distributed Hash Table (DHT), which is used to resolve
content IDs to peer addresses. We provide efficient detection and mitigation
mechanisms for this vulnerability. Our mechanisms achieve a 99.6% detection
rate and mitigate 100% of the detected attacks with minimal signaling and
computational overhead. We followed responsible disclosure procedures, and our
countermeasures are scheduled for deployment in future versions of IPFS.
Comment: 15 pages (including references), 15 figures. Accepted to be published at the Network and Distributed System Security (NDSS) Symposium 202
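The DHT behaviour the attack exploits can be sketched minimally: Kademlia resolves a content ID to the peers whose node IDs are closest to it under the XOR metric, and those peers hold the provider records. The sketch below uses short integer IDs for illustration (real IPFS uses 256-bit hashes); it is a conceptual model, not the IPFS implementation.

```python
# Minimal sketch of Kademlia-style resolution as used conceptually in
# the IPFS DHT: provider records for a content ID live on the k peers
# whose node IDs are closest to it under the XOR metric.
def xor_distance(a, b):
    return a ^ b

def closest_peers(content_id, peer_ids, k=3):
    # These k peers are queried to resolve content_id to providers.
    # An attacker who generates Sybil IDs even closer to content_id
    # can displace all of them and answer lookups with nothing,
    # censoring that specific content.
    return sorted(peer_ids, key=lambda p: xor_distance(p, content_id))[:k]

peers = [0b0001, 0b0110, 0b1010, 0b1100, 0b1111]
cid = 0b1011
providers = closest_peers(cid, peers)
```

Because anyone can mint node IDs at negligible cost, placing k Sybils nearer to a chosen `cid` than every honest peer is cheap, which is why the censorship attack described above needs only minimal effort against a targeted content ID.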